A Data Matrix code is a two-dimensional matrix barcode consisting of black and white "cells" or modules arranged in either a square or rectangular pattern. The information to be encoded can be text or raw data. The usual data size ranges from a few bytes up to 1556 bytes. The length of the encoded data depends on the symbol dimension used. Error correction codes are added to make symbols more robust: even if a symbol is partly damaged, it can still be read. A Data Matrix symbol can store up to 2,335 alphanumeric characters.
Data Matrix symbols are rectangular in shape, and usually square. They are made of cells: small elements that each represent one bit. Depending on the situation a "light" module is a 0 and a "dark" module is a 1, or vice versa. Every Data Matrix is composed of two solid adjacent borders in an "L" shape (called the "finder pattern") and two other borders consisting of alternating dark and light "cells" or modules (called the "timing pattern"). Within these borders are rows and columns of cells encoding information. The finder pattern is used to locate and orient the symbol, while the timing pattern provides a count of the number of rows and columns in the symbol. As more data is encoded in the symbol, the number of cells (rows and columns) increases. ECC 200 symbol sizes vary from 8×18 (the smallest rectangular symbol) to 144×144 (the largest square symbol).
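The border layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a symbol generator: the function name `draw_border` and the text rendering are hypothetical, and the sketch assumes a single data region with even dimensions, the solid finder pattern on the left and bottom edges, and the timing pattern on the top and right edges ending in a light upper right corner.

```python
def draw_border(rows: int, cols: int) -> list[str]:
    """Render only the border of a symbol as text.

    '#' is a dark module, '.' is a light module, and ' ' marks the
    interior, which this sketch leaves empty (no data is placed).
    """
    if rows % 2 or cols % 2:
        raise ValueError("this sketch assumes even dimensions")
    grid = [[" "] * cols for _ in range(rows)]
    for c in range(cols):
        # Top edge: timing pattern, alternating and starting dark.
        grid[0][c] = "#" if c % 2 == 0 else "."
        # Bottom edge: one leg of the solid "L" finder pattern.
        grid[rows - 1][c] = "#"
    for r in range(rows):
        # Right edge: timing pattern, light in the upper right corner.
        grid[r][cols - 1] = "#" if r % 2 == 1 else "."
        # Left edge: the other leg of the solid "L" finder pattern.
        grid[r][0] = "#"
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    print("\n".join(draw_border(10, 10)))
```

A reader can count the alternating modules along the top and right edges to recover the number of columns and rows, which is the role the text assigns to the timing pattern.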
The most popular application for Data Matrix is marking small items, due to the code's ability to encode fifty characters in a symbol that is readable at 2 or 3 mm² and the fact that the code can be read with only a 20% contrast ratio. The Data Matrix is scalable, with commercial applications as small as 300 micrometres (laser etched on a 600 micrometre silicon device) and as large as 1 metre (3 ft) square (painted on the roof of a boxcar). The fidelity of the marking and reading systems is the only limitation.
The United States of America's Electronic Industries Alliance (EIA) recommends using Data Matrix for labeling small electronic components.[1]
Data Matrix codes are part of a new traceability drive in many industries in the United States of America, particularly aerospace where quality control is tight and a black market exists for counterfeit or non-serviceable parts. Data Matrix codes (and accompanying alpha-numeric data) identify details of the component, including manufacturer ID, part number and a unique serial number. The US Department of Defense has selected Data Matrix for the mandatory unique identification of certain assets it procures for all of the services. Items from individual weapons to critical components of major systems must be permanently marked with a unique data matrix code in accordance with standards in Military Standard 130. Much of the Aerospace Industry, especially members of the Air Transport Association (ATA), aims to have all components of every new aircraft identified by Data Matrix codes within a tight deadline.[2]
The Data Matrix format is used by Semacode to encode 4096 bits RSA private keys that can be read by cameras or scanners.
Data Matrix symbols are made up of modules arranged within a perimeter finder and timing pattern. A symbol can encode up to 3,116 numeric digits, or up to 2,335 alphanumeric characters, drawn from the entire ASCII character set (with extensions). The symbol consists of data regions which contain modules set out in a regular array. Large symbols contain several regions. Each data region is delimited by a finder pattern, and this is surrounded on all four sides by a quiet zone border (margin). (Note: The modules may be round or square; no specific shape is defined in the standard. For example, dot-peened cells are generally round.)
ECC 200 is the newest version of Data Matrix and supports advanced encoding error checking and correction algorithms (such as Reed-Solomon). ECC 200 allows the routine reconstruction of the entire encoded data string when the symbol has sustained 30% damage, assuming the matrix can still be accurately located. Data Matrix has an error rate of less than 1 in 10 million characters scanned.[3]
Symbols have an even number of rows and an even number of columns. Most of the symbols are square, with sizes from 10×10 to 144×144. Some symbols, however, are rectangular, with sizes from 8×18 to 16×48 (even values only). All symbols using ECC 200 error correction can be recognized by the upper right corner module being the same as the background color (binary 0).
Additional capabilities that differentiate ECC 200 symbols from the earlier standards include:
Older versions of Data Matrix include ECC 000, ECC 050, ECC 080, ECC 100 and ECC 140. These vary in the amount of error correction they offer, with ECC 000 offering none and ECC 140 offering the greatest. These older versions always have an odd number of modules, and can be made in sizes ranging from 9×9 to 49×49. All symbols using ECC 000 through 140 error correction can be recognized by the upper right corner module being the inverse of the background color (binary 1).
According to ISO/IEC 16022, "ECC 000 - 140 should only be used in closed applications where a single party controls both the production and reading of the symbols and is responsible for overall system performance."
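The size rules above suggest a quick plausibility check on a symbol's dimensions. The following Python sketch is only an illustration: the function name `guess_ecc_family` and its return strings are hypothetical, and it relies solely on the parity and size ranges stated in the text, not on the full list of valid sizes in the standard.

```python
def guess_ecc_family(rows: int, cols: int) -> str:
    """Guess the Data Matrix version family from the module counts alone."""
    # ECC 200 symbols have an even number of rows and of columns.
    if rows % 2 == 0 and cols % 2 == 0:
        return "ECC 200"
    # Older symbols have an odd module count, from 9x9 up to 49x49.
    if rows == cols and rows % 2 == 1 and 9 <= rows <= 49:
        return "ECC 000-140"
    return "unknown"


if __name__ == "__main__":
    for size in [(10, 10), (8, 18), (9, 9), (51, 51)]:
        print(size, guess_ecc_family(*size))
```

A real reader would also inspect the upper right corner module, which is light for ECC 200 and dark for the older versions.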
Data Matrix codes are becoming common on printed media such as labels and letters. The code can be read quickly by a barcode reader which allows the media to be tracked, for example when a parcel has been dispatched to the recipient.
For industrial engineering purposes, Data Matrix codes can be marked directly onto components, ensuring that only the intended component is identified with the Data Matrix encoded data. The codes can be marked onto components with various methods, but within the aerospace industry these are commonly industrial ink-jet, dot-peen marking, laser marking, and electrolytic chemical etching (ECE). These methods give a permanent mark which should last the lifetime of the component.
After creation of the Data Matrix code, the code is usually verified using specialist camera equipment and software. This verification ensures the code conforms to the relevant standards, and ensures it will be readable for the lifetime of the component. After the component enters service, the Data Matrix code can be read by a reader camera, which decodes the data for purposes such as movement tracking or inventory stock checks.
Data Matrix codes, along with other openly specified codes such as 1D barcodes, can now also be read with mobile phones by installing a reader application on a compatible phone. Although the majority of these mobile readers are capable of reading Data Matrix, only a few can extend the decoding to enable mobile access and interaction, whereupon the codes can be used securely and across media; for example, in track and trace, anti-counterfeit, e-government, and banking solutions.
Data Matrix was invented by International Data Matrix, Inc. (ID Matrix), which was merged into RVSI/Acuity CiMatrix; that company was acquired by Siemens AG in October 2005 and by Microscan Systems in September 2008. Data Matrix is covered today by several ISO/IEC standards and is in the public domain for many applications, which means it can be used free of any licensing or royalties.
Although this is a free standard, there are no free documents that explain the encoding process. Documentation in PDF or paper format can be purchased from the ISO web site.[4]
The diagram below illustrates the placement of the message data within a Data Matrix symbol. The message is "Wikipedia", and it is arranged in a somewhat complicated diagonal pattern starting near the upper-left corner. Some characters are split in two pieces, such as the initial W. Also shown are the end-of-message code (marked End), the padding (P) and error correction (E) bytes, and four modules of unused space (X).
There are multiple encoding modes used to store different kinds of messages. The default mode stores one ASCII character per 8-bit codeword. Control codes are provided to switch between modes, as shown below.
Codeword | Interpretation |
---|---|
0 | Not used |
1 – 128 | ASCII data (ASCII value + 1) |
129 | End of message |
130 – 229 | Digit pairs 00 – 99 |
230 | Begin C40 encoding |
231 | Begin Base 256 encoding |
232 | FNC1 |
233 | Structured append. Allows a message to be split across multiple symbols. |
234 | Reader programming |
235 | Set high bit of the following character |
236 | 05 Macro |
237 | 06 Macro |
238 | Begin ANSI X12 encoding |
239 | Begin Text encoding |
240 | Begin EDIFACT encoding |
241 | Extended Channel Interpretation code |
242 – 255 | Not used |
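As a rough illustration of the default mode, the Python sketch below turns a message into codewords using only two rows of the table: ASCII value + 1, and the digit-pair codewords 130 to 229. The function name `encode_ascii_mode` is hypothetical, mode switches and the high-bit codeword 235 are not handled, and no padding or error correction is added.

```python
def encode_ascii_mode(message: str) -> list[int]:
    """Encode a message in the default ASCII mode (sketch only)."""
    codewords = []
    i = 0
    while i < len(message):
        pair = message[i:i + 2]
        # Two consecutive ASCII digits share one codeword: 130 + value.
        if len(pair) == 2 and pair.isascii() and pair.isdigit():
            codewords.append(130 + int(pair))
            i += 2
            continue
        value = ord(message[i])
        if value > 127:
            # Would need codeword 235 (set high bit); not covered here.
            raise ValueError("only 7-bit ASCII is handled in this sketch")
        codewords.append(value + 1)  # ASCII data: ASCII value + 1
        i += 1
    return codewords


if __name__ == "__main__":
    print(encode_ascii_mode("Wikipedia"))
    print(encode_ascii_mode("123456"))
```

The digit-pair rule means that a purely numeric message needs about half as many codewords as it has digits.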
The C40, Text and X12 modes are potentially more compact for storing text messages. They use character codes in the range 0–39, and three of these codes (C1, C2, C3) are packed into two bytes (B1, B2) as follows: V = C1×1600 + C2×40 + C3 + 1, B1 = floor(V/256), B2 = V mod 256.
The resulting value of B1 is in the range 0–250. The special value 254 is used to return to ASCII encoding mode.
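The packing of three codes into two bytes can be sketched as follows. The sketch assumes the usual formulation V = 1600×C1 + 40×C2 + C3 + 1 with B1 = V div 256 and B2 = V mod 256; the function names are hypothetical, and mapping characters to codes (the sets in the table that follows) is left out.

```python
def pack_c40_triplet(c1: int, c2: int, c3: int) -> tuple[int, int]:
    """Pack three codes in the range 0-39 into the bytes (B1, B2)."""
    for code in (c1, c2, c3):
        if not 0 <= code <= 39:
            raise ValueError("codes must be in the range 0-39")
    v = 1600 * c1 + 40 * c2 + c3 + 1
    return divmod(v, 256)  # (B1, B2)


def unpack_c40_pair(b1: int, b2: int) -> tuple[int, int, int]:
    """Recover the three codes from a byte pair (inverse of packing)."""
    v = b1 * 256 + b2 - 1
    c1, rest = divmod(v, 1600)
    c2, c3 = divmod(rest, 40)
    return c1, c2, c3


if __name__ == "__main__":
    # "ABC" in C40 set 0 corresponds to the codes 14, 15, 16.
    print(pack_c40_triplet(14, 15, 16))
    print(unpack_c40_pair(89, 233))
```

The largest triple (39, 39, 39) gives V = 64000, that is the bytes (250, 0), so a first byte of 254 never occurs in packed data and is free to act as the return-to-ASCII marker.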
Character code interpretations are shown in the table below. The C40 and Text modes have four separate sets. Set 0 is the default, and contains codes that temporarily select a different set for the next character. Set 1 contains ASCII control codes, while set 2 contains punctuation symbols; these sets are identical in C40 and Text mode.
Code | C40 set 0 | set 1 (C40 and Text) | set 2 (C40 and Text) | C40 set 3 | Text set 0 | Text set 3 | X12 |
---|---|---|---|---|---|---|---|
0 | set 1 | NUL | ! | ` | set 1 | ` | CR |
1 | set 2 | SOH | " | a | set 2 | A | * |
2 | set 3 | STX | # | b | set 3 | B | > |
3 | space | ETX | $ | c | space | C | space |
4 | 0 | EOT | % | d | 0 | D | 0 |
5 | 1 | ENQ | & | e | 1 | E | 1 |
6 | 2 | ACK | ' | f | 2 | F | 2 |
7 | 3 | BEL | ( | g | 3 | G | 3 |
8 | 4 | BS | ) | h | 4 | H | 4 |
9 | 5 | HT | * | i | 5 | I | 5 |
10 | 6 | LF | + | j | 6 | J | 6 |
11 | 7 | VT | , | k | 7 | K | 7 |
12 | 8 | FF | - | l | 8 | L | 8 |
13 | 9 | CR | . | m | 9 | M | 9 |
14 | A | SO | / | n | a | N | A |
15 | B | SI | : | o | b | O | B |
16 | C | DLE | ; | p | c | P | C |
17 | D | DC1 | < | q | d | Q | D |
18 | E | DC2 | = | r | e | R | E |
19 | F | DC3 | > | s | f | S | F |
20 | G | DC4 | ? | t | g | T | G |
21 | H | NAK | @ | u | h | U | H |
22 | I | SYN | [ | v | i | V | I |
23 | J | ETB | \ | w | j | W | J |
24 | K | CAN | ] | x | k | X | K |
25 | L | EM | ^ | y | l | Y | L |
26 | M | SUB | _ | z | m | Z | M |
27 | N | ESC | FNC1 | { | n | { | N |
28 | O | FS | | \| | o | \| | O |
29 | P | GS | | } | p | } | P |
30 | Q | RS | hibit | ~ | q | ~ | Q |
31 | R | US | | DEL | r | DEL | R |
32 | S | | | | s | | S |
33 | T | | | | t | | T |
34 | U | | | | u | | U |
35 | V | | | | v | | V |
36 | W | | | | w | | W |
37 | X | | | | x | | X |
38 | Y | | | | y | | Y |
39 | Z | | | | z | | Z |
EDIFACT mode uses six bits per character, with four characters packed into three bytes. It can store digits, upper-case letters, and many punctuation marks, but has no support for lower-case letters.
Code | Meaning |
---|---|
0 – 30 | ASCII codes 64 – 94 |
31 | Return to ASCII mode |
32 – 63 | ASCII codes 32 – 63 |
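The table and the four-into-three packing can be illustrated with a short Python sketch. The function names are hypothetical, and the bit order (first character in the most significant bits) is an assumption of this sketch rather than something stated in the text above.

```python
def edifact_value(ch: str) -> int:
    """Map one character to its 6-bit EDIFACT code.

    ASCII 64-94 map to codes 0-30 and ASCII 32-63 map to codes 32-63,
    which is the same as keeping the low six bits of the ASCII value.
    Code 31 is reserved for "return to ASCII mode".
    """
    code = ord(ch)
    if not 32 <= code <= 94:
        raise ValueError(f"{ch!r} cannot be encoded in EDIFACT mode")
    return code & 0x3F


def pack_edifact_quad(values: list[int]) -> list[int]:
    """Pack four 6-bit codes into three bytes (24 bits in total)."""
    if len(values) != 4 or any(not 0 <= v <= 63 for v in values):
        raise ValueError("expected four values in the range 0-63")
    a, b, c, d = values
    bits = (a << 18) | (b << 12) | (c << 6) | d
    return [(bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF]


if __name__ == "__main__":
    codes = [edifact_value(ch) for ch in "DATA"]
    print(codes, pack_edifact_quad(codes))
```

Lower-case letters start at ASCII 97 and so fall outside the accepted range, which matches the restriction described above.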
Base 256 mode data starts with a length indicator, followed by a number of data bytes. A length of 1 to 249 is encoded as a single byte, and longer lengths are stored as two bytes.
It is desirable to avoid long strings of zeros in the coded message, because they become large blank areas in the Data Matrix symbol, which may cause a scanner to lose synchronization. (The default ASCII encoding does not use zero for this reason.) In order to make that less likely, the length and data bytes are obscured by adding a pseudorandom value R(n), where n is the position in the byte stream.
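The text above does not spell out R(n) or the two-byte length format. The Python sketch below assumes the "255-state" formula commonly quoted for ISO/IEC 16022, R(n) = (149 × n) mod 255 + 1 with n counted from 1, added modulo 256, and assumes that lengths of 250 or more are stored as (length div 250 + 249, length mod 250). The function names and the `start_position` parameter are hypothetical.

```python
def randomize_255(value: int, position: int) -> int:
    """Obscure one byte with the assumed pseudorandom value R(position)."""
    r = (149 * position) % 255 + 1  # R(n) is always in the range 1-255
    return (value + r) % 256


def base256_length_field(length: int) -> list[int]:
    """Encode the length indicator as one or two bytes."""
    if 1 <= length <= 249:
        return [length]
    first = length // 250 + 249  # assumed two-byte format
    if length >= 250 and first <= 255:
        return [first, length % 250]
    raise ValueError("length out of range for this sketch")


def encode_base256(data: bytes, start_position: int) -> list[int]:
    """Length indicator plus data bytes, each obscured by R(n).

    start_position is the 1-based position in the codeword stream of
    the first byte after the "Begin Base 256" codeword (231).
    """
    plain = base256_length_field(len(data)) + list(data)
    return [randomize_255(b, start_position + i) for i, b in enumerate(plain)]


if __name__ == "__main__":
    # Two zero bytes placed directly after a leading 231 codeword.
    print(encode_base256(b"\x00\x00", start_position=2))
```

In the example, the two zero data bytes come out as non-zero codewords, which is the effect the obscuring step is meant to achieve.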
Prior to the expiration of U.S. Patent 5,612,524, intellectual property company Acacia Technologies claimed that Data Matrix was partially covered by its contents. As the patent owner, Acacia allegedly contacted Data Matrix users demanding license fees related to the patent.
Cognex Corporation, a large manufacturer of 2D barcode devices, filed a declaratory judgment complaint on March 13, 2006 after receiving information that Acacia had contacted its customers demanding licensing fees. On May 19, 2008 Judge Joan N. Ericksen of the U.S. District Court in Minnesota ruled in favor of Cognex. The ruling held that the '524 patent, which claimed to cover a system for capturing and reading 2D symbology codes, is both invalid and unenforceable due to inequitable conduct by the defendants during the procurement of the patent.
Notably, since the '524 patent expired in November 2007, a ruling against Cognex wouldn't have affected current use of Data Matrix codes. However, it would have established that use of Data Matrix prior to November 2007 could potentially be covered by the '524 patent.
A German Patent Application DE 4107020 was filed in 1991, and published in 1992. This patent is not cited in the above US patent applications and might invalidate them.
In May 2006 a German computer programmer, Bernd Hopfengärtner, created a large data matrix in a wheat field (in a fashion similar to crop circles). The message read "Hello, World!".[5]